A Globally Convergent LCL Method for Nonlinear Optimization
For optimization problems with nonlinear constraints, linearly constrained
Lagrangian (LCL) methods sequentially minimize a Lagrangian function subject to
linearized constraints. These methods converge rapidly near a solution but may
not be reliable from arbitrary starting points. The well-known example MINOS
has proven effective on many large problems. Its success motivates us to
propose a globally convergent variant. Our stabilized LCL method possesses two
important properties: the subproblems are always feasible, and they may be
solved inexactly. These features are present in MINOS only as heuristics.
The new algorithm has been implemented in Matlab, with the option to use
either the MINOS or SNOPT Fortran codes to solve the linearly constrained
subproblems. Only first derivatives are required. We present numerical results
on a nonlinear subset of the COPS, CUTE, and HS test-problem sets, which
include many large examples. The results demonstrate the robustness and
efficiency of the stabilized LCL procedure. Comment: 34 pages
Tail bounds for stochastic approximation
Stochastic-approximation gradient methods are attractive for large-scale
convex optimization because they offer inexpensive iterations. They are
especially popular in data-fitting and machine-learning applications where the
data arrives in a continuous stream, or it is necessary to minimize large sums
of functions. It is known that by appropriately decreasing the variance of the
error at each iteration, the expected rate of convergence matches that of the
underlying deterministic gradient method. Conditions are given under which this
happens with overwhelming probability.
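The variance-reduction idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a scalar least-squares problem and shrinks the gradient error by growing the sample size geometrically. The function names, the growth factor, and the synthetic data are all hypothetical.

```python
import random


def sampled_gradient_descent(a, b, steps=60, growth=1.1, seed=0):
    """Minimize (1/2n) * sum((a[i]*x - b[i])**2) over a scalar x.

    Each step uses a gradient estimate from a random subsample whose size
    grows geometrically, so the variance of the gradient error decreases.
    """
    rng = random.Random(seed)
    n = len(a)
    # Step size 1/L, where L = mean(a_i^2) is the Lipschitz constant
    # of the full gradient.
    step = 1.0 / (sum(ai * ai for ai in a) / n)
    x, batch = 0.0, 1.0
    for _ in range(steps):
        m = min(n, int(batch))            # current sample size, capped at n
        idx = rng.sample(range(n), m)     # sample without replacement
        grad = sum(a[i] * (a[i] * x - b[i]) for i in idx) / m
        x -= step * grad
        batch *= growth                   # larger sample -> smaller variance
    return x


def least_squares_solution(a, b):
    """Closed-form minimizer, used only to check the iterate."""
    return sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)


# Synthetic data (an assumption for illustration): b is roughly 2*a plus noise.
data_rng = random.Random(1)
a = [data_rng.uniform(0.5, 1.5) for _ in range(200)]
b = [2.0 * ai + data_rng.gauss(0.0, 0.1) for ai in a]

x_hat = sampled_gradient_descent(a, b)
x_star = least_squares_solution(a, b)
```

With these settings the sample eventually covers the whole data set, at which point the iteration coincides with the deterministic gradient method.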
Low-rank spectral optimization via gauge duality
Various applications in signal processing and machine learning give rise to
highly structured spectral optimization problems characterized by low-rank
solutions. Two important examples that motivate this work are optimization
problems from phase retrieval and from blind deconvolution, which are designed
to yield rank-1 solutions. An algorithm is described that is based on solving a
certain constrained eigenvalue optimization problem that corresponds to the
gauge dual which, unlike the more typical Lagrange dual, has an especially
simple constraint. The dominant cost at each iteration is the computation of
rightmost eigenpairs of a Hermitian operator. A range of numerical examples
illustrate the scalability of the approach. Comment: Final version. To appear in SIAM J. Scientific Computing
Efficient evaluation of scaled proximal operators
Quadratic-support functions [Aravkin, Burke, and Pillonetto; J. Mach. Learn.
Res. 14(1), 2013] constitute a parametric family of convex functions that
includes a range of useful regularization terms found in applications of convex
optimization. We show how an interior method can be used to efficiently compute
the proximal operator of a quadratic-support function under different metrics.
When the metric and the function have the right structure, the proximal map can
be computed with cost nearly linear in the input size. We describe how to use
this approach to implement quasi-Newton methods for a rich class of nonsmooth
problems that arise, for example, in sparse optimization, image denoising, and
sparse logistic regression. Comment: 23 pages
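As a small illustration of a proximal operator under a non-Euclidean metric, the sketch below handles one special case with a closed form: the 1-norm scaled by lam, with a diagonal metric. It does not use the interior method described in the abstract, and the function name and inputs are hypothetical.

```python
def scaled_prox_l1(x, d, lam):
    """Return argmin_z lam*||z||_1 + 0.5 * sum(d[i] * (z[i] - x[i])**2).

    Assumes every d[i] > 0 (a diagonal, positive-definite metric). The
    problem separates by coordinate, and each coordinate is solved by
    soft-thresholding with threshold lam / d[i].
    """
    out = []
    for xi, di in zip(x, d):
        threshold = lam / di
        magnitude = max(abs(xi) - threshold, 0.0)
        out.append(magnitude if xi > 0 else -magnitude)
    return out


# Coordinates with a larger metric weight are shrunk less.
z = scaled_prox_l1([3.0, -1.0, 0.5], [1.0, 2.0, 4.0], lam=1.0)
```

When the metric is not diagonal the problem no longer separates by coordinate, which is the setting where a more general solver is needed.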
Gauge optimization and duality
Gauge functions significantly generalize the notion of a norm, and gauge
optimization, as defined by Freund (1987), seeks the element of a convex set
that is minimal with respect to a gauge function. This conceptually simple
problem can be used to model a remarkable array of useful problems, including a
special case of conic optimization, and related problems that arise in machine
learning and signal processing. The gauge structure of these problems allows
for a special kind of duality framework. This paper explores the duality
framework proposed by Freund, and proposes a particular form of the problem
that exposes some useful properties of the gauge optimization framework (such
as the variational properties of its value function), and yet maintains most of
the generality of the abstract form of gauge optimization. Comment: 24 p
Variational properties of value functions
Regularization plays a key role in a variety of optimization formulations of
inverse problems. A recurring theme in regularization approaches is the
selection of regularization parameters, and their effect on the solution and on
the optimal value of the optimization problem. The sensitivity of the value
function to the regularization parameter can be linked directly to the Lagrange
multipliers. This paper characterizes the variational properties of the value
functions for a broad class of convex formulations, which are not all covered
by standard Lagrange multiplier theory. An inverse function theorem is given
that links the value functions of different regularization formulations (not
necessarily convex). These results have implications for the selection of
regularization parameters, and the development of specialized algorithms.
Numerical examples illustrate the theoretical results. Comment: 30 pages
Recovering Compressively Sampled Signals Using Partial Support Information
In this paper we study recovery conditions of weighted $\ell_1$ minimization
for signal reconstruction from compressed sensing measurements when partial
support information is available. We show that if at least 50% of the (partial)
support information is accurate, then weighted $\ell_1$ minimization is stable
and robust under weaker conditions than the analogous conditions for standard
$\ell_1$ minimization. Moreover, weighted $\ell_1$ minimization provides better
bounds on the reconstruction error in terms of the measurement noise and the
compressibility of the signal to be recovered. We illustrate our results with
extensive numerical experiments on synthetic data and real audio and video
signals. Comment: 22 pages, 10 figures
Smooth Structured Prediction Using Quantum and Classical Gibbs Samplers
We introduce two quantum algorithms for solving structured prediction
problems. We show that a stochastic subgradient descent method that uses the
quantum minimum finding algorithm and takes its probabilistic failure into
account solves the structured prediction problem with a runtime that scales
with the square root of the size of the label space, and in with respect to the precision, , of the
solution. Motivated by robust inference techniques in machine learning, we
introduce another quantum algorithm that solves a smooth approximation of the
structured prediction problem with a similar quantum speedup in the size of the
label space and a similar scaling in the precision parameter. In doing so, we
analyze a stochastic gradient algorithm for convex optimization in the presence
of an additive error in the calculation of the gradients, and show that its
convergence rate does not deteriorate if the additive errors are of the order
. This algorithm uses quantum Gibbs sampling at temperature
as a subroutine. Based on these theoretical observations,
we propose a method for using quantum Gibbs samplers to combine feedforward
neural networks with probabilistic graphical models for quantum machine
learning. Our numerical results using Monte Carlo simulations on an image
tagging task demonstrate the benefit of the approach.
Fast Dual Variational Inference for Non-Conjugate LGMs
Latent Gaussian models (LGMs) are widely used in statistics and machine
learning. Bayesian inference in non-conjugate LGMs is difficult due to
intractable integrals involving the Gaussian prior and non-conjugate
likelihoods. Algorithms based on variational Gaussian (VG) approximations are
widely employed since they strike a favorable balance between accuracy,
generality, speed, and ease of use. However, the structure of the optimization
problems associated with these approximations remains poorly understood, and
standard solvers take too long to converge. We derive a novel dual variational
inference approach that exploits the convexity property of the VG
approximations. We obtain an algorithm that solves a convex optimization
problem, reduces the number of variational parameters, and converges much
faster than previous methods. Using real-world data, we demonstrate these
advantages on a variety of LGMs, including Gaussian process classification, and
latent Gaussian Markov random fields. Comment: 9 pages, 3 figures
A perturbation view of level-set methods for convex optimization
Level-set methods for convex optimization are predicated on the idea that
certain problems can be parameterized so that their solutions can be recovered
as the limiting process of a root-finding procedure. This idea emerges time and
again across a range of algorithms for convex problems. Here we demonstrate
that strong duality is a necessary condition for the level-set approach to
succeed. In the absence of strong duality, the level-set method identifies
$\epsilon$-infeasible points that do not converge to a feasible point as
$\epsilon$ tends to zero. The level-set approach is also used as a proof
technique for establishing sufficient conditions for strong duality that are
different from Slater's constraint qualification.
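The root-finding idea behind level-set methods can be sketched on a toy problem, chosen here only for illustration: minimize ||x||_2 subject to a.x = b, with a nonzero. The flipped problem v(tau) = min |a.x - b| over ||x||_2 <= tau has the closed form max(|b| - tau*||a||_2, 0), and the optimal value is the smallest tau with v(tau) = 0. The function names and the use of bisection are assumptions, not the procedure analyzed in the paper.

```python
import math


def value(tau, a, b):
    """v(tau) = min |a.x - b| over ||x||_2 <= tau (closed form for this toy)."""
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    return max(abs(b) - tau * norm_a, 0.0)


def level_set_bisection(a, b, tol=1e-10):
    """Find the smallest tau with v(tau) = 0 by bisection.

    Assumes a is a nonzero vector, so the constraint a.x = b is feasible.
    Returns (tau, x), where x attains a.x = b with ||x||_2 = tau.
    """
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    lo, hi = 0.0, 1.0
    while value(hi, a, b) > 0.0:      # grow hi until the level set is feasible
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if value(mid, a, b) > 0.0:    # still infeasible: the root lies above mid
            lo = mid
        else:                         # feasible: the root lies at or below mid
            hi = mid
    tau = hi
    scale = math.copysign(tau, b) / norm_a
    return tau, [scale * ai for ai in a]


a_vec, b_val = [3.0, 4.0], 10.0
tau_star, x_star = level_set_bisection(a_vec, b_val)
residual = abs(sum(ai * xi for ai, xi in zip(a_vec, x_star)) - b_val)
```

Strong duality holds for this toy problem, so the iterates approach a feasible point; the abstract concerns what happens when it fails.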